Kudu / KUDU-1451

Restarting a TS that had a lot of deleted tablets takes tens of minutes

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.9.0
    • Component/s: tablet
    • Labels: None

    Description

      While running a workload that deletes and recreates a table with hundreds of tablets every few hours, I hit an issue: if the cluster hasn't been restarted in a while, a restarting TS takes tens of minutes to process all the deleted tablets. The logs for a single tablet look like:

      I0511 19:03:52.380903 79512 ts_tablet_manager.cc:609] Loading metadata for tablet 28f7ae54ac1d413b8a0e694e1dcef0fc
      I0511 19:03:52.391885 79512 ts_tablet_manager.cc:937] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Tablet Manager startup: Rolling forward tablet deletion of type TABLET_DATA_DELETED
      I0511 19:03:52.391894 79512 ts_tablet_manager.cc:964] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Deleting tablet data with delete state TABLET_DATA_DELETED
      I0511 19:03:52.497952 79512 ts_tablet_manager.cc:974] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Tablet deleted. Last logged OpId: 406.65248
      I0511 19:03:52.498010 79512 ts_tablet_manager.cc:946] T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: Deleting tablet superblock
      

      In my latest instance of this problem, running on 43c9c87604f3b6f3dd286c63344bf18a2db08c21, it took almost 20 minutes to process 18k deleted tablets. Only then could the TS start bootstrapping.
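
As a rough sanity check on those numbers, here is a minimal Python sketch that measures the time span of the single-tablet log excerpt above and compares it with the average implied by "almost 20 minutes" for 18k tablets. The helper name `glog_time` is hypothetical, and 20 minutes is used as an upper bound, so the figures are estimates only.

```python
from datetime import datetime

# First and last log lines for the sample tablet, copied from the excerpt above.
FIRST_LINE = (
    "I0511 19:03:52.380903 79512 ts_tablet_manager.cc:609] "
    "Loading metadata for tablet 28f7ae54ac1d413b8a0e694e1dcef0fc"
)
LAST_LINE = (
    "I0511 19:03:52.498010 79512 ts_tablet_manager.cc:946] "
    "T 28f7ae54ac1d413b8a0e694e1dcef0fc P d87c4ff7b7124cf8839940b71ed1704d: "
    "Deleting tablet superblock"
)


def glog_time(line: str) -> datetime:
    """Parse the HH:MM:SS.ffffff field (second token) of a glog-style line."""
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


# Time between the first and last message logged for the sample tablet.
# The last message marks the start of the superblock deletion, so the
# true per-tablet cost is at least this long.
sample_seconds = (glog_time(LAST_LINE) - glog_time(FIRST_LINE)).total_seconds()

TABLETS = 18_000            # "18k deleted tablets" from the report
OBSERVED_SECONDS = 20 * 60  # "almost 20 minutes", treated as an upper bound

# What the restart would cost if every tablet took as long as the sample.
projected_minutes = sample_seconds * TABLETS / 60

# Average per-tablet cost implied by the observed total.
observed_avg_ms = OBSERVED_SECONDS / TABLETS * 1000

print(f"sample tablet:      {sample_seconds * 1000:.1f} ms")
print(f"projected at 18k:   {projected_minutes:.1f} min")
print(f"observed average:   {observed_avg_ms:.1f} ms per tablet")
```

The sample tablet takes noticeably longer than the observed average, so it is probably one of the slower ones. Either way, a cost of tens of milliseconds per tablet across 18k tablets is enough to explain a startup delay of this size.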

            People

              Assignee: Mike Percy (mpercy)
              Reporter: Jean-Daniel Cryans (jdcryans)
              Votes: 0
              Watchers: 3